Audio Signal De-Identification

ABSTRACT

Techniques are disclosed for automatically de-identifying spoken audio signals. In particular, techniques are disclosed for automatically removing personally identifying information from spoken audio signals and replacing such information with non-personally identifying information. De-identification of a spoken audio signal may be performed by automatically generating a report based on the spoken audio signal. The report may include concept content (e.g., text) corresponding to one or more concepts represented by the spoken audio signal. The report may also include timestamps indicating temporal positions of speech in the spoken audio signal that corresponds to the concept content. Concept content that represents personally identifying information is identified. Audio corresponding to the personally identifying concept content is removed from the spoken audio signal. The removed audio may be replaced with non-personally identifying audio.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a divisional of commonly-owned and co-pending U.S. patent application Ser. No. 11/064,343, filed on Feb. 23, 2005, entitled, “Audio Signal De-Identification.”

This application is related to the following commonly-owned U.S. patent applications, both of which are hereby incorporated by reference:

Ser. No. 10/923,517, filed on Aug. 20, 2004, entitled “Automated Extraction of Semantic Content and Generation of a Structured Document from Speech”; and

Ser. No. 10/922,513, filed on Aug. 20, 2004, entitled “Document Transcription System Training.”

BACKGROUND

1. Field of the Invention

The present invention relates to techniques for performing automated speech recognition and, more particularly, to techniques for removing personally identifying information from data used in human-assisted transcription services.

2. Related Art

It is desirable in many contexts to generate a written document based on human speech. In the legal profession, for example, transcriptionists transcribe testimony given in court proceedings and in depositions to produce a written transcript of the testimony. Similarly, in the medical profession, transcripts are produced of diagnoses, prognoses, prescriptions, and other information dictated by doctors and other medical professionals.

At first, transcription was performed solely by human transcriptionists who would listen to speech, either in real-time (i.e., in person by “taking dictation”) or by listening to a recording. One benefit of human transcriptionists is that they may have domain-specific knowledge, such as knowledge of medicine and medical terminology, which enables them to interpret ambiguities in speech and thereby to improve transcript accuracy.

It is common for hospitals and other healthcare institutions to outsource the task of transcribing medical reports to a Medical Transcription Service Organization (MTSO). For example, referring to FIG. 1, a diagram is shown of the typical dataflow in a conventional medical transcription system 100 using an outsourced MTSO. A physician 102 dictates notes 104 into a dictation device 106, such as a digital voice recorder, personal digital assistant (PDA), or a personal computer running dictation software. The dictation device 106 stores the spoken notes 104 in a digital audio file 108.

The audio file 108 is transmitted to a data server 110 at the MTSO. Note that if the dictation device 106 is a telephone, the audio file 108 need not be stored at the site of the physician 102. Rather, the telephone may transmit signals representing the notes 104 to the data server 110, which may generate and store the audio file 108 at the site of the MTSO data server 110.

The MTSO may interface to a hospital information system (HIS) database 112 which includes demographic information regarding, for example, the dictating physician 102 (such as his or her name, address, and specialty), the patient (such as his or her name, date of birth, and medical record number), and the encounter (such as a work type and name and address of a referring physician). Optionally, the MTSO data server 110 may match the audio file 108 with corresponding demographic information 114 from the HIS database 112 and transmit the audio file 108 and matched demographic information 114 to a medical transcriptionist (MT) 116. Various techniques are well-known for matching the audio file 108 with the demographic information 114. The dictation device 106 may, for example, store meta-data (such as the name of the physician 102 and/or patient) which may be used as a key into the database 112 to identify the corresponding demographic information 114.

The medical transcriptionist 116 may transcribe the audio file 108 (using the demographic information 114, if it is available, as an aid) to produce a draft report 118. The medical transcriptionist 116 transmits the draft report 118 back to the MTSO data server 110. Although not shown in FIG. 1, the draft report 118 may be verified and corrected by a second medical transcriptionist to produce a second draft report. The MTSO (through the data server 110 or some other means) transmits a final report 122 back to the physician 102, who may further edit the report 122.

Sensitive information about the patient (such as his or her name, history, and name/address of physician) may be contained within the notes 104, the audio file 108, the demographic information 114, the draft report 118, and the final report 122. As a result, increasingly stringent regulations have been developed to govern the handling of patient information in the context illustrated by FIG. 1. Even so, sensitive patient information may travel through many hands during the transcription process. For example, the audio file 108 and admission-discharge-transfer (ADT) information may be transferred from the physician 102 or HIS database 112 to the off-site MTSO data server 110. Although the primary MTSO data server 110 may be located within the U.S., an increasing percentage of data is forwarded from the primary data server 110 to a secondary data server (not shown) in another country such as India, Pakistan, or Indonesia, where non-U.S. persons may have access to sensitive patient information. Even if data are stored by the MTSO solely within the U.S., non-U.S. personnel of the MTSO may have remote access to the data. Furthermore, the audio file 108 may be distributed to several medical transcriptionists before the final report 122 is transmitted back to the physician 102. All sensitive patient information may be freely accessible to all handlers during the transcription process.

What is needed, therefore, are improved techniques for maintaining the privacy of patient information during the medical transcription process.

SUMMARY

Techniques are disclosed for automatically de-identifying audio signals. In particular, techniques are disclosed for automatically removing personally identifying information from spoken audio signals and replacing such information with non-personally identifying information. De-identification of an audio signal may be performed by automatically generating a report based on the spoken audio signal. The report may include concept content (e.g., text) corresponding to one or more concepts represented by the audio signal. The report may also include timestamps indicating temporal positions of speech in the audio signal that corresponds to the concept content. Concept content that represents personally identifying information is identified. Portions of the audio signal that correspond to the personally identifying concept content are removed from the audio signal. The removed portions may be replaced with non-personally identifying audio signals.

For example, in one aspect of the present invention, techniques are provided for: (A) identifying a first portion of an original audio signal, the first portion representing sensitive information, such as personally identifying information; and (B) producing a modified audio signal in which the identified first portion is protected against unauthorized disclosure.

The identified first portion may be protected in any of a variety of ways, such as by removing the identified first portion from the original audio signal to produce the modified audio signal, whereby the modified audio signal does not include the identified first portion.

Alternatively, for example, a security measure may be applied to the identified first portion to produce the modified audio signal, wherein the identified first portion in the modified audio signal is protected against unauthorized disclosure. The security measure may, for example, include encrypting the identified first portion.

The first portion may be identified in any of a variety of ways, such as by identifying a candidate portion of the original audio signal, determining whether the candidate portion represents personally identifying information, and identifying the candidate portion as the first portion if the candidate portion represents personally identifying information.

Furthermore, the first audio signal portion may be replaced with a second audio signal portion that does not include personally identifying information. The second audio signal may, for example, be a non-speech audio signal or an audio signal representing a type of concept represented by the identified portion.

The first portion may, for example, be identified by: (1) generating a report, the report comprising: (a) content representing information in the original audio signal, and (b) at least one timestamp indicating at least one temporal position of at least one portion of the original audio signal corresponding to the content; (2) identifying a first personally identifying concept in the report; (3) identifying a first timestamp in the report corresponding to the first personally identifying concept; and (4) identifying a portion of the original audio signal corresponding to the first personally identifying concept by using the first timestamp. The first personally identifying concept may be removed from the report to produce a de-identified report.

The de-identified audio signal may be transcribed to produce a transcript of the de-identified audio signal. The transcript may, for example, be a literal or non-literal transcript of the de-identified audio signal. The transcript may, for example, be produced using an automated speech recognizer.

Other features and advantages of various aspects and embodiments of the present invention will become apparent from the following description and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of the typical dataflow in a conventional medical transcription system using an outsourced Medical Transcription Service Organization (MTSO);

FIGS. 2A-2B are diagrams of a system for de-identifying a spoken audio signal according to one embodiment of the present invention;

FIG. 2C is a block diagram illustrating the operation of the de-identifier of FIG. 2B in greater detail according to one embodiment of the present invention;

FIG. 2D is a block diagram illustrating an audio file including both personally identifying information and non-personally identifying information according to one embodiment of the present invention;

FIG. 2E is a block diagram illustrating an audio file including only non-personally identifying information according to one embodiment of the present invention;

FIG. 3A is a flowchart of a method for de-identifying an audio signal according to one embodiment of the present invention; and

FIG. 3B is a flowchart of alternative techniques for performing a portion of the method of FIG. 3A according to one embodiment of the present invention.

DETAILED DESCRIPTION

The term “personally identifying information” refers herein to any information that identifies a particular individual, such as a medical patient. For example, a person's name is an example of personally identifying information. The Health Insurance Portability and Accountability Act of 1996 (HIPAA) includes a variety of regulations establishing privacy and security standards for personally identifying health care information. For example, HIPAA requires that certain personally identifying information (such as names and birthdates) be removed from text reports in certain situations. This process is one example of “de-identification.” More generally, the term “de-identification” refers to the process of removing, generalizing, or replacing personally identifying information so that the relevant data records are no longer personally identifying. Typically, however, personally identifying information is not removed from audio recordings because it would be prohibitively costly to do so using conventional techniques.

In embodiments of the present invention, techniques are provided for performing de-identification of audio recordings and other audio signals. In particular, techniques are disclosed for removing certain pre-determined data elements from audio signals.

For example, referring to FIGS. 2A-2B, a diagram is shown of the dataflow in a medical transcription system 200 using an outsourced MTSO according to one embodiment of the present invention. FIG. 2A illustrates a first portion 200 a of the system 200, while FIG. 2B illustrates a second (partially overlapping) portion 200 b of the system 200. Referring to FIG. 3A, a flowchart is shown of a method 300 performed by the system 200 according to one embodiment of the present invention.

In one embodiment of the present invention, a physician 202 dictates notes 204 into a dictation device 206 to produce an audio file 208, as described above with respect to FIG. 1. The audio file 208 is transmitted to a data server 210 at the MTSO. A report generator 230 receives the audio file 208 (step 302). Note that the report generator 230 may reside at the site of the MTSO. The report generator 230 generates a concept-marked report 232 based on the audio file 208 and (optionally) the demographic information 214 (step 304).

The report generator 230 may, for example, generate the concept-marked report 232 using the techniques disclosed in the above-referenced patent application entitled “Automated Extraction of Semantic Content and Generation of a Structured Document from Speech.” The report 232 may, for example, include a literal or non-literal transcript of the audio file 208. Text in the report 232 that represents concepts, such as names, dates, and addresses, may be marked so that such “concept text” may be identified and processed automatically by a computer. For example, in one embodiment of the present invention, the report 232 is an Extensible Markup Language (XML) document and concept text in the report 232 is marked using XML tags. Concepts in the report 232 may be represented not only by text but also by other kinds of data. Therefore, more generally the report generator 230 generates “concept content” representing concepts that appear in the audio file 208 (step 306). The report may include not only concepts but also plain text, such as text corresponding to speech in the audio file 208 which the report generator 230 does not identify as corresponding to a concept.
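
By way of illustration only, automatic processing of an XML concept-marked report might be sketched as follows in Python. The tag names PATIENT and WORKTYPE follow the examples used in this description; the REPORT root element and the list_concepts helper are assumptions introduced solely for this sketch and are not part of the disclosed report format.

```python
import xml.etree.ElementTree as ET

# Hypothetical concept-marked report. The REPORT root element is an assumption;
# text outside any child element plays the role of unmarked "plain text".
REPORT_XML = (
    "<REPORT>Patient <PATIENT>Richard James</PATIENT>, "
    "dictation of <WORKTYPE>PROGRESSNOTE</WORKTYPE></REPORT>"
)


def list_concepts(report_xml):
    """Return (concept_type, concept_text) pairs for each marked concept."""
    root = ET.fromstring(report_xml)
    # Each direct child of the root is treated as one marked concept.
    return [(child.tag, "".join(child.itertext())) for child in root]


concepts = list_concepts(REPORT_XML)
```

In this sketch the concept type is simply the XML tag name, so that later processing can select concepts by type without inspecting their text.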

The above-referenced patent application entitled “Automated Extraction of Semantic Content and Generation of a Structured Document from Speech” further describes the use of language models that are based on “concept grammars.” The report generator 230 may include a speech recognizer which uses such language models to generate the concept-marked report 232. Those grammars can be configured using the demographic information that was provided with the audio recording 208. For example, a birthday grammar may be configured to expect the particular birthday of the patient that is the subject of the audio recording 208. Although such customization is not required, it may increase the accuracy of speech recognition and the subsequent report 232. If no such demographic information is available, generic slot fillers may be used. For example, name lists of the most frequent first and last names may stand in for missing patient name information.
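
The fallback from patient-specific configuration to generic slot fillers might, under stated assumptions, be sketched as follows. The function name name_slot_fillers, the "patient_name" key, and the placeholder entries in GENERIC_NAMES are all hypothetical; an actual concept grammar would be configured through the speech recognizer's own interfaces, which are not shown here.

```python
# Placeholder entries standing in for a real list of frequent names (assumption).
GENERIC_NAMES = ["NAME_A", "NAME_B"]


def name_slot_fillers(demographics, generic_names=GENERIC_NAMES):
    """Return the names that a patient-name grammar should expect."""
    patient_name = (demographics or {}).get("patient_name")
    if patient_name:
        # Demographic information is available: expect this patient's name.
        return [patient_name]
    # No demographic information: fall back to the generic slot fillers.
    return list(generic_names)


configured = name_slot_fillers({"patient_name": "Richard James"})
fallback = name_slot_fillers(None)
```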

The MTSO data server 210 may match the audio file 208 with corresponding demographic information 214 a received from the HIS database 212 or another source and transmit the audio file 208 and matched demographic information 214 a to the report generator 230. Such demographic information 214 a may assist the report generator 230 in identifying concepts in the audio file 208. Use of the demographic information 214 a by the report generator 230 is not, however, required.

The report generator 230 may also generate timestamps for each of the concept contents generated in step 306 (step 308). The timestamps indicate the temporal positions of speech in the audio file 208 that corresponds to the concept contents in the report 232.

Referring to FIG. 2B, the system 200 also includes an audio de-identifier 234 which receives as its input the audio file 208, the concept-marked report 232, and optionally the demographic information 214 a. The de-identifier 234 removes personally-identifying information from the audio file 208 and thereby produces a de-identified audio file 236 which does not include personally identifying information (step 310). The de-identifier 234 may also remove personally identifying information from the demographic information 214 a to produce de-identified demographic information 214 b. The demographic information 214 a may typically be de-identified easily because it is provided in a form (such as an XML document) in which personally identifying information is marked as such. Referring to FIG. 2C, a block diagram is shown which illustrates the operation of the de-identifier 234 in more detail according to one embodiment of the present invention.

In the example illustrated in FIG. 2C, the concept-marked report 232 includes three concept contents 252 a-c and corresponding timestamps 254 a-c. Although in practice the concept-marked report 232 may include a large number of marked concepts, only three concept contents 252 a-c are illustrated in FIG. 2C for ease of illustration and explanation. Assume for purposes of example that the concept contents 252 a-c correspond to sequential and adjacent portions of the audio file 208. For example, assume that the audio file 208 contains the speech “Patient Richard James, dictation of progress note, date of birth Mar. 28, 1967,” that concept content 252 a is the text “Richard James,” that concept content 252 b is the text “dictation of progress note,” and that the concept content 252 c is the date Mar. 28, 1967.

In the example just described, concept content 252 a represents a patient name, which is an example of personally identifying information; concept content 252 b represents the type of document being created, which is an example of non-personally identifying information; and concept content 252 c represents the patient's birthday, which is an example of personally identifying information. Note that in this example the text “dictation of progress note” may be further subdivided into the non-concept speech “dictation of” and the non-personally identifying concept “progress note” (which is an example of a work type concept). For ease of explanation, however, the text “dictation of progress note” will simply be described herein as a non-personally identifying concept.

Referring to FIG. 2D, a diagram is shown illustrating the audio file 208 in the example above. Time advances in the direction of arrow 271. The audio signal 208 includes three portions 270 a-c, corresponding to concept contents 252 a-c, respectively. In other words, portion 270 a is an audio signal (e.g., the speech “Patient Richard James”) corresponding to concept content 252 a (e.g., the structured text “<PATIENT>Richard James</PATIENT>”); portion 270 b is an audio signal (e.g., the speech “dictation of progress note”) corresponding to concept content 252 b (e.g., the structured text “dictation of <WORKTYPE>PROGRESSNOTE</WORKTYPE>”); and portion 270 c is an audio signal (e.g., the speech “Mar. 28, 1967”) corresponding to concept content 252 c (e.g., the structured text “<DATE><MONTH>3</MONTH><DAY>28</DAY><YEAR>1967</YEAR></DATE>”).

Portions 270 a and 270 c, which correspond to a patient name and patient birthday, respectively, are labeled as “personally identifying” audio signals in FIG. 2D, while portion 270 b, which corresponds to the type of document being created, is labeled as a “non-personally identifying” audio signal in FIG. 2D. Note that the particular choice of concepts which qualify as personally identifying and non-personally identifying may vary from application to application. The particular choices used in the examples herein are not required by the present invention.

Timestamps 254 a-c indicate the temporal positions of speech in the audio file 208 that corresponds to concept contents 252 a-c, respectively. For example, timestamp 254 a indicates the start and end time of portion 270 a; timestamp 254 b indicates the start and end time of portion 270 b; and timestamp 254 c indicates the start and end time of portion 270 c. Note that timestamps 254 a-c may be represented in any of a variety of ways, such as by start and end times or by start times and durations.
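
The two timestamp representations just mentioned carry the same information, as the following minimal sketch suggests. The Timestamp class, its field names, and the use of seconds as the unit are assumptions made for illustration; the numeric values are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timestamp:
    """Temporal position of a portion of audio, stored as start and end times."""
    start: float  # seconds from the beginning of the audio file (assumed unit)
    end: float

    @classmethod
    def from_duration(cls, start, duration):
        # Alternative representation: a start time plus a duration.
        return cls(start, start + duration)

    @property
    def duration(self):
        return self.end - self.start


a = Timestamp(0.0, 1.5)
b = Timestamp.from_duration(0.0, 1.5)
```

Here a and b compare equal, so either representation may be used to locate the same portion of the audio file.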

Returning to FIG. 2C and FIG. 3A, the de-identifier 234 performs de-identification on the audio file 208 by identifying concept contents in the report 232 which represent personally identifying information (step 312). Concept contents representing personally identifying information may be identified in any of a variety of ways. For example, a set of personally identifying concept types 258 may indicate which concept types qualify as “personally identifying.” For example, the personally identifying concept types 258 may indicate concept types such as patient name, patient address, and patient date of birth. Concept types 258 may be represented using the same markers (e.g., XML tags) that are used to mark concepts in the report 232. A personally identifying information identifier 256 may identify as personally identifying concept content 259 any concept contents in the report 232 having the same type as any of the personally identifying concept types 258. For example, assuming that the personally identifying concept types 258 include patient name and patient date of birth but not examination date, the personally identifying information identifier 256 identifies the concept content 252 a (e.g., “<PATIENT>Richard James</PATIENT>”) and the concept content 252 c (“<DATE><MONTH>3</MONTH><DAY>28</DAY><YEAR>1967</YEAR></DATE>”) as personally identifying concept content 259.
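
A minimal sketch of such type-based identification (step 312) is given below. The ConceptContent structure, the BIRTHDATE type name (used here to distinguish a date of birth from other dates), and the timestamp values are hypothetical and serve only to illustrate matching concept contents against a set of personally identifying concept types.

```python
from typing import NamedTuple


class ConceptContent(NamedTuple):
    concept_type: str  # e.g. the XML tag used to mark the concept
    text: str
    start: float       # timestamp start, in seconds (illustrative values)
    end: float         # timestamp end, in seconds


# Assumed set of personally identifying concept types.
PERSONALLY_IDENTIFYING_TYPES = {"PATIENT", "BIRTHDATE"}


def identify_pii(concepts, pii_types):
    """Select the concept contents whose type is listed as personally identifying."""
    return [c for c in concepts if c.concept_type in pii_types]


report = [
    ConceptContent("PATIENT", "Richard James", 0.0, 1.5),
    ConceptContent("WORKTYPE", "PROGRESSNOTE", 1.5, 3.0),
    ConceptContent("BIRTHDATE", "1967-03-28", 3.0, 5.0),
]
pii = identify_pii(report, PERSONALLY_IDENTIFYING_TYPES)
```

Because each selected concept content carries its own timestamp, the result can be used directly to locate the corresponding portions of the audio file.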

The de-identifier 234 may include a personally identifying information remover 260 which removes any personally identifying information from the audio file 208 (step 314). The remover 260 may perform such removal by using the timestamps 254 a, 254 c in the personally identifying concept content 259 to identify the corresponding portions 270 a, 270 c (FIG. 2D) of the audio file 208, and then removing such portions 270 a, 270 c.

The remover 260 may also replace the removed portions 270 a, 270 c with non-personally identifying information (step 316) to produce the de-identified audio file 236. A set of non-personally identifying concepts 262, for example, may specify audio signals to substitute for one or more personally identifying concept types. The remover 260 may identify substitute audio signals 272 a, 272 b corresponding to the personally identifying audio signals 270 a, 270 c and replace the personally identifying audio signals 270 a, 270 c with the corresponding substitute audio signals 272 a, 272 b. The result, as shown in FIG. 2E, is that the de-identified audio file 236 includes only non-personally identifying audio signals 272 a, 270 b, and 272 b.

Any of a variety of substitute audio signals may be used to replace personally identifying audio signals in the audio file 208. For example, in one embodiment of the present invention, the de-identifier 234 replaces all personally identifying audio signals in the audio file 208 with a short beep, thereby indicating to the transcriptionist 216 (or other listener) that part of the audio file 208 has been suppressed.
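
The removal and replacement of steps 314 and 316 might be sketched as follows, assuming for illustration that the audio is held in memory as a list of samples. The sample rate, the beep parameters, and the function names beep and deidentify are assumptions of this sketch; an actual implementation would operate on whatever audio format the dictation device produces.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)


def beep(duration_s, freq_hz=1000.0, rate=SAMPLE_RATE):
    """Generate a short sine tone to stand in for suppressed audio."""
    n = int(round(duration_s * rate))
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def deidentify(samples, spans, substitute, rate=SAMPLE_RATE):
    """Replace each (start_s, end_s) span of `samples` with `substitute`."""
    out = []
    cursor = 0
    for start_s, end_s in sorted(spans):
        start = int(round(start_s * rate))
        end = int(round(end_s * rate))
        out.extend(samples[cursor:start])  # keep non-personally identifying audio
        out.extend(substitute)             # substitute for the removed portion
        cursor = max(cursor, end)
    out.extend(samples[cursor:])           # keep any audio after the last span
    return out


original = [0.5] * (5 * SAMPLE_RATE)       # five seconds of placeholder audio
spans = [(0.0, 1.5), (3.0, 5.0)]           # timestamps of the portions to remove
result = deidentify(original, spans, beep(0.25))
```

Note that the substitute audio need not have the same duration as the removed portion, so the de-identified audio may be shorter or longer than the original.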

In another embodiment of the present invention, each of the personally identifying audio portions 270 a, 270 c is replaced with an audio signal that indicates the type of concept that has been replaced. For example, a particular family name (e.g., “James”) may be replaced with the generic audio signal “family name,” thereby indicating to the listener that a family name has been suppressed in the audio file 208.

In yet another embodiment of the present invention, each of the personally identifying audio portions 270 a, 270 c is replaced with an audio signal that indicates both the type of concept that has been replaced and an identifier of the replaced concept which distinguishes the audio portion from other audio portions representing the same type of concept. For example, the first family name that is replaced may be replaced with the audio signal “family name 1,” while the second family name that is replaced may be replaced with the audio signal “family name 2.” Assuming that the audio signal “family name 1” replaced the family name “James,” subsequent occurrences of the same family name (“James”) may be replaced with the same replacement audio signal (e.g., “family name 1”).
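
One possible way to assign such numbered replacement labels consistently is sketched below. The make_labeler helper and the second family name “Smith” are hypothetical; the sketch produces text labels only, and the generation of the corresponding replacement audio is not shown.

```python
def make_labeler():
    """Create a function that assigns a stable, numbered label per distinct value."""
    seen = {}      # (concept_type, value) -> label already assigned
    counters = {}  # concept_type -> number of distinct values seen so far

    def label_for(concept_type, value):
        key = (concept_type, value)
        if key not in seen:
            counters[concept_type] = counters.get(concept_type, 0) + 1
            seen[key] = f"{concept_type} {counters[concept_type]}"
        return seen[key]

    return label_for


label_for = make_labeler()
labels = [label_for("family name", name) for name in ["James", "Smith", "James"]]
```

In this sketch the third occurrence (“James”) receives the same label as the first, so a listener can tell that both refer to the same suppressed name.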

Although not shown in FIGS. 2A-2E, the audio file 208 may include a header portion. The header portion may include such dictated audio signals as the name of the physician and the name of the patient. In another embodiment of the present invention, the de-identifier 234 removes the header portion of the audio file 208 regardless of the contents of the remainder of the audio file 208.

The de-identifier 234 may also perform de-identification on the concept-marked report 232 to produce a de-identified concept-marked report 238 (step 318). De-identification of the concept-marked report 232 may be performed by replacing personally identifying text or other content in the report 232 with non-personally identifying text or other content. For example, the concept contents 252 a and 252 c in the report 232 may be replaced with non-personally identifying contents in the de-identified report 238. The non-personally identifying content that is placed in the de-identified report 238 may match the audio that is placed in the de-identified audio file 236. For example, if the spoken audio “James” is replaced with the spoken audio “family name” in the de-identified audio file 236, then the text “James” may be replaced with the text “family name” in the de-identified report 238.
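
De-identification of an XML concept-marked report (step 318) might be sketched as follows. The REPORT root element, the deidentify_report function, and the replacement text “patient name” are assumptions introduced for illustration; the replacement text would in practice be chosen to match whatever audio is placed in the de-identified audio file.

```python
import xml.etree.ElementTree as ET


def deidentify_report(report_xml, replacements):
    """Replace the content of each listed concept type with generic text."""
    root = ET.fromstring(report_xml)
    # Materialize the element list first so the tree can be modified safely.
    for elem in list(root.iter()):
        if elem.tag in replacements:
            tail = elem.tail          # text following the element must be kept
            elem.clear()              # drops text, attributes, and nested elements
            elem.tail = tail
            elem.text = replacements[elem.tag]
    return ET.tostring(root, encoding="unicode")


report_xml = (
    "<REPORT>Patient <PATIENT>Richard James</PATIENT>, "
    "dictation of <WORKTYPE>PROGRESSNOTE</WORKTYPE></REPORT>"
)
deidentified = deidentify_report(report_xml, {"PATIENT": "patient name"})
```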

The de-identifier 234 provides the de-identified audio file 236, and optionally the demographic information 214 b and/or the de-identified concept-marked report 238, to a medical transcriptionist 216 (step 320). The medical transcriptionist 216 transcribes the de-identified audio file 236 to produce a draft report 218 (step 322). The transcriptionist 216 may produce the draft report 218 by transcribing the de-identified audio file 236 from scratch, or by beginning with the de-identified concept-marked report 238 and editing the report 238 in accordance with the de-identified audio file 236. The medical transcriptionist 216 may be provided with a set of pre-determined tags (e.g., “FamilyName2”) to use as substitutes for beeps and other de-identification markers in the de-identified audio file 236. If the transcriptionist 216 cannot identify the type of concept that has been replaced with a de-identification marker, the transcriptionist 216 may insert a special marker into the draft report requesting further processing by a person (such as the physician 202) who has access to the original audio file 208.

One advantage of various embodiments of the present invention is that they enable the automatic de-identification of audio recordings and other audio signals. As described above, conventional de-identification techniques are limited to use for de-identifying text documents. Failure to de-identify audio recordings, however, exposes private information in the process of generating transcripts in systems such as the one shown in FIG. 1. By applying the audio de-identification techniques disclosed herein, transcription work can be outsourced without raising privacy concerns.

In particular, the techniques disclosed herein enable a division of labor to be implemented which protects privacy while maintaining a high degree of transcription accuracy. For example, the physician's health care institution typically trusts that the U.S.-based operations of the MTSO will protect patient privacy because of privacy regulations governing the MTSO in the U.S. The audio file 208, which contains personally identifying information, may therefore be transmitted by the physician 202 to the MTSO data server 210 with a high degree of trust. The MTSO may then use the de-identifier to produce the de-identified audio file 236, which may be safely shipped to an untrusted offshore transcriptionist without raising privacy concerns. Because the MTSO maintains the private patient information (in the audio file 208) in the U.S., the MTSO may use such information in the U.S. to verify the accuracy of the concept-marked report 232 and the draft report 218. Transcript accuracy is therefore achieved without requiring additional effort by the physician 202 or health care institution, and without sacrificing patient privacy.

It is to be understood that although the invention has been described above in terms of particular embodiments, the foregoing embodiments are provided as illustrative only, and do not limit or define the scope of the invention. Various other embodiments, including but not limited to the following, are also within the scope of the claims. For example, elements and components described herein may be further divided into additional components or joined together to form fewer components for performing the same functions.

Examples of concept types representing personally identifying information include, but are not limited to, name, gender, birth date, address, phone number, diagnosis, drug prescription, and social security number. The term “personally identifying concept” refers herein to any concept of a type that represents personally identifying information.

Although particular examples disclosed herein involve transcribing medical information, this is not a requirement of the present invention. Rather, the techniques disclosed herein may be applied within fields other than medicine where de-identification of audio signals is desired.

Furthermore, the techniques disclosed herein may be applied not only to personally identifying information, but also to other kinds of sensitive information, such as classified information, that may or may not be personally identifying. The techniques disclosed herein may be used to detect such information and to remove it from an audio file to protect it from disclosure.

Although in the example described above with respect to FIGS. 2D-2E all of the personally identifying information in the audio file 208 was removed, this is not a requirement of the present invention. It may not be possible or feasible to identify and remove all personally identifying information in all cases. In such cases, the techniques disclosed herein may remove less than all of the personally identifying information in the audio file 208. As a result, the de-identified audio file 236 may include some personally identifying information. Such partial de-identification may, however, still be valuable because it may substantially increase the difficulty of correlating the remaining personally identifying information with personally identifying information in other data sources.

Furthermore, referring to FIG. 3B, a flowchart is shown of an alternative method for implementing step 310 of FIG. 3A according to one embodiment of the present invention. The method identifies concept contents representing sensitive information (step 330) and applies a security measure to the sensitive information to protect it against unauthorized disclosure in the de-identified audio file 236 (step 332). Although the security measure may involve removing the sensitive information, as described above with respect to FIG. 3A, other security measures may be applied. For example, the sensitive information may be encrypted in, rather than removed from, the de-identified audio file 236. Sensitive information may also be protected against unauthorized disclosure by applying other forms of security to the information, such as by requiring a password to access the sensitive information. Any kind of security scheme, such as a multi-level security scheme which provides different access privileges to different users, may be applied to protect the sensitive information against unauthorized disclosure. Such schemes do not require that the sensitive information be removed from the file to protect it against unauthorized disclosure. Note that in accordance with such schemes, the entire audio file 236 (including both sensitive and non-sensitive information) may be encrypted, in which case access to the sensitive information may be selectively granted only to those users with sufficient access privileges.
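
A multi-level access scheme of the general kind described above might, in its simplest form, be sketched as follows. The Segment structure, the integer access levels, and the render_for_user function are assumptions of this sketch. The sketch illustrates only selective access; it performs no encryption, and protecting stored audio would additionally require a vetted cryptographic implementation, which is not shown.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    audio: List[float]   # samples of this portion of the audio file
    required_level: int  # minimum access level needed; 0 means non-sensitive


def render_for_user(segments, user_level, placeholder):
    """Assemble the audio a given user may hear, masking restricted segments."""
    out = []
    for seg in segments:
        if user_level >= seg.required_level:
            out.extend(seg.audio)      # user has sufficient access privileges
        else:
            out.extend(placeholder)    # sensitive portion is withheld
    return out


segments = [Segment([0.1, 0.2], 2), Segment([0.3], 0)]
for_untrusted = render_for_user(segments, user_level=0, placeholder=[0.0])
for_trusted = render_for_user(segments, user_level=2, placeholder=[0.0])
```

Under this arrangement the sensitive portion remains in the stored file and is disclosed only to users whose access level is sufficient.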

Although particular examples described herein refer to protecting information about U.S. persons against disclosure to non-U.S. persons, the present invention is not limited to providing this kind of protection. Rather, any criteria may be used to determine who should be denied access to sensitive information. Examples of such criteria include not only geographic location, but also job function and security clearance status.
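
As a minimal sketch of how such criteria might be evaluated, the following example combines geographic location, job function, and security clearance into a single access decision. The `User` and `AccessPolicy` types, their field names, and the numeric clearance scale are assumptions made for illustration and do not appear in the embodiments described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Hypothetical user attributes matching the criteria named in the text."""
    location: str
    job_function: str
    clearance: int


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical policy: every configured criterion must be satisfied."""
    allowed_locations: frozenset
    allowed_job_functions: frozenset
    minimum_clearance: int

    def permits(self, user: User) -> bool:
        """Return True only if the user meets all three criteria."""
        return (user.location in self.allowed_locations
                and user.job_function in self.allowed_job_functions
                and user.clearance >= self.minimum_clearance)


policy = AccessPolicy(
    allowed_locations=frozenset({"US"}),
    allowed_job_functions=frozenset({"transcriptionist", "physician"}),
    minimum_clearance=2,
)
```

Under this sketch, a user who fails any one criterion is denied access to the sensitive information; other embodiments could equally combine the criteria differently, for example by granting access when any one of them is met.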

The techniques described above may be implemented, for example, in hardware, software, firmware, or any combination thereof. The techniques described above may be implemented in one or more computer programs executing on a programmable computer including a processor, a storage medium readable by the processor (including, for example, volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Program code may be applied to input entered using the input device to perform the functions described and to generate output. The output may be provided to one or more output devices.

Each computer program within the scope of the claims below may be implemented in any programming language, such as assembly language, machine language, a high-level procedural programming language, or an object-oriented programming language. The programming language may, for example, be a compiled or interpreted programming language.

Each such computer program may be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a computer processor. Method steps of the invention may be performed by a computer processor executing a program tangibly embodied on a computer-readable medium to perform functions of the invention by operating on input and generating output. Suitable processors include, by way of example, both general and special purpose microprocessors. Generally, the processor receives instructions and data from a read-only memory and/or a random access memory. Storage devices suitable for tangibly embodying computer program instructions include, for example, all forms of non-volatile memory, such as semiconductor memory devices, including EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROMs. Any of the foregoing may be supplemented by, or incorporated in, specially-designed ASICs (application-specific integrated circuits) or FPGAs (Field-Programmable Gate Arrays). A computer can generally also receive programs and data from a storage medium such as an internal disk (not shown) or a removable disk. These elements will also be found in a conventional desktop or workstation computer as well as other computers suitable for executing computer programs implementing the methods described herein, which may be used in conjunction with any digital print engine or marking engine, display monitor, or other raster output device capable of producing color or gray scale pixels on paper, film, display screen, or other output medium.

1. A method comprising steps of: (A) identifying a first portion of an original audio signal, the first portion representing personally identifying content, by performing steps of: (A)(1) generating a report, the report comprising: (a) content representing information in the original audio signal, and (b) at least one timestamp indicating at least one temporal position of at least one portion of the original audio signal corresponding to the content; (A)(2) identifying a first personally identifying concept in the report; (A)(3) identifying a first timestamp in the report corresponding to the first personally identifying concept; (A)(4) identifying a portion of the original audio signal corresponding to the first personally identifying concept by using the first timestamp; and (B) removing the identified first portion from the original audio signal to produce a modified audio signal, which does not include the identified first portion.
2. The method of claim 1, further comprising a step of: (C) transcribing the de-identified audio signal to produce a transcript of the de-identified audio signal.
3. A system comprising: identification means comprising means for identifying a first portion of an original audio signal, the first portion representing personally identifying content, the identification means comprising: means for generating a report, the report comprising: (a) content representing information in the original audio signal, and (b) at least one timestamp indicating at least one temporal position of at least one portion of the original audio signal corresponding to the content; means for identifying a first personally identifying concept in the report; means for identifying a first timestamp in the report corresponding to the first personally identifying concept; means for identifying a portion of the original audio signal corresponding to the first personally identifying concept by using the first timestamp; and means for removing the identified first portion from the original audio signal to produce a modified audio signal, which does not include the identified first portion.
4. The system of claim 3, further comprising: means for transcribing the modified audio signal to produce a transcript of the modified audio signal.